High-Specificity Targeted Functional Profiling in Microbial Communities with ShortBRED
Abstract
Profiling microbial community function from metagenomic sequencing data remains a computationally challenging problem. Mapping millions of DNA reads from such samples to reference protein databases requires long run-times, and short read lengths can result in spurious hits to unrelated proteins (loss of specificity). We developed ShortBRED (Short, Better Representative Extract Dataset) to address these challenges, facilitating fast, accurate functional profiling of metagenomic samples. ShortBRED consists of two components: (i) a method that reduces reference proteins of interest to short, highly representative amino acid sequences ("markers") and (ii) a search step that maps reads to these markers to quantify the relative abundance of their associated proteins. After evaluating ShortBRED on synthetic data, we applied it to profile antibiotic resistance protein families in the gut microbiomes of individuals from the United States, China, Malawi, and Venezuela. Our results support antibiotic resistance as a core function in the human gut microbiome, with tetracycline-resistant ribosomal protection proteins and Class A beta-lactamases being the most widely distributed resistance mechanisms worldwide. ShortBRED markers are applicable to other homology-based search tasks, which we demonstrate here by identifying phylogenetic signatures of antibiotic resistance across more than 3,000 microbial isolate genomes. ShortBRED can be applied to profile a wide variety of protein families of interest; the software, source code, and documentation are available for download at http://huttenhower.sph.harvard.edu/shortbred.
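The two components described in the abstract (deriving short markers that are specific to each protein of interest, then counting reads that hit those markers) can be illustrated with a toy sketch. This is not ShortBRED's implementation: it assumes amino-acid reads, exact substring matching, and fixed-length k-mers as stand-ins for markers. The function names `find_markers` and `profile`, the sequences, and the reads are all hypothetical.

```python
from collections import Counter


def find_markers(proteins, k=5):
    """Return, for each protein, the k-mers that occur in no other protein.

    Toy stand-in for marker identification: real markers are not
    fixed-length k-mers, and this uses exact matching only.
    """
    kmer_owners = {}
    for name, seq in proteins.items():
        for i in range(len(seq) - k + 1):
            kmer_owners.setdefault(seq[i:i + k], set()).add(name)

    markers = {name: set() for name in proteins}
    for kmer, owners in kmer_owners.items():
        if len(owners) == 1:  # keep only k-mers unique to one protein
            markers[next(iter(owners))].add(kmer)
    return markers


def profile(reads, markers):
    """Count reads containing at least one marker of each protein,
    then scale the counts so they sum to 1 (relative abundance)."""
    counts = Counter()
    for read in reads:
        for name, kmers in markers.items():
            if any(kmer in read for kmer in kmers):
                counts[name] += 1
    total = sum(counts.values())
    return {name: (counts[name] / total if total else 0.0) for name in markers}


# Hypothetical proteins sharing a 10-residue prefix, which therefore
# contributes no markers to either of them.
proteins = {
    "famA": "MKTAYIAKQRQISFVKSHFSRQ",
    "famB": "MKTAYIAKQRGGSWLEHPNDTC",
}
# Hypothetical amino-acid reads; the last one falls entirely in the shared prefix.
reads = ["QISFVKSH", "SWLEHPND", "FVKSHFSR", "MKTAYIAK"]

markers = find_markers(proteins, k=5)
abundance = profile(reads, markers)
print(abundance)
```

The read drawn from the shared prefix matches no marker and is ignored, which is the specificity gain the abstract describes: a search against the full sequences would have assigned that read to both proteins.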
Similar resources
Development of an environmental functional gene microarray for soil microbial communities.
Functional attributes of microbial communities are difficult to study, and most current techniques rely on DNA- and rRNA-based profiling of taxa and genes, including microarrays containing sequences of known microorganisms. To quantify gene expression in environmental samples in a culture-independent manner, we constructed an environmental functional gene microarray (E-FGA) consisting of 13,056...
HiSpOD: probe design for functional DNA microarrays
MOTIVATION: The use of DNA microarrays allows the monitoring of the extreme microbial diversity encountered in complex samples like environmental ones as well as that of their functional capacities. However, no probe design software currently available is adapted to easily design efficient and explorative probes for functional gene arrays. RESULTS: We present a new efficient functional microarr...
Development and application of functional gene arrays for microbial community analysis
Functional gene markers can provide important information about functional gene diversity and potential activity of microbial communities. Although microarray technology has been successfully applied to study gene expression for pure cultures, simple, and artificial microbial communities, adapting such a technology to analyze complex microbial communities still presents a lot of challenges in t...
Oligonucleotide microarray methodology for taxonomic and functional monitoring of microbial community composition
Microarray analysis is a cultivation-independent, high-throughput technology that can be used for direct and simultaneous identification of microorganisms in complex environmental samples. This review summarizes current methodologies for oligonucleotide microarrays used in microbial ecology. It deals with probe design, microarray manufacturing, sample preparation and labeling, and data handling...
Transient dynamics of competitive exclusion in microbial communities.
Microbial metabolism drives our planet's biogeochemistry and plays a central role in industrial processes. Molecular profiling in bioreactors has revealed that microbial community composition can be highly variable while maintaining constant functional performance. Furthermore, following perturbation, bioreactor performance typically recovers rapidly, while community composition slowly returns t...